This paper describes an architecture for an information manager that is at the core of a
sensor-based autonomous system. The architecture provides the means by which sensor-based
data can be integrated with stored knowledge to provide the information needed for
autonomous behavior. The overall architecture can be viewed as a community of independent
processes, each of which interacts with an active database whose structure mirrors that
of the three-dimensional world.


1  Introduction 

This paper describes the architecture of an information manager
that is at the core of a sensor-based autonomous system.
The architecture accommodates data from a wide variety of
sensors as well as from other sources of stored knowledge.
Each source of information imposes its own constraints upon
the system design; however, our design decisions were motivated
by the requirements for organizing visual data. More
general sensory data includes visual data, but the information
that all sensors capture is information about the visual
world, information that is comprehensible only in terms
of the visual world. The task of organizing this information
is the task of providing a framework for data about the
world, whatever the data source (sensor type). The visual
world and the means for structuring data sensed from that
world are thus the subjects of this paper.

Current models of machine vision describe the vision process
as a series of tasks that convert the visual signal into
a set of symbols that characterize the entities in the scene.
The series of tasks from signal to symbol is commonly described
by vision researchers as moving from low-level to
high-level vision. Low-level vision processes the input signal
to find features of the signal such as image structure, e.g.
intensity edges, or features of the three-dimensional scene,
e.g. surface shape. High-level vision is more cognitive in nature:
it usually assumes that objects in the scene have been
delineated, and that the task at hand is to use world knowledge
and reasoning techniques to recognize the objects and
to determine the relationships among them. Intermediate-level
vision has the role of converting low-level features into
objects suitable for higher-level processing. Intermediate vision
converts image features into scene entities, signals into
symbols.

In the past, machine vision research has concentrated on
both low-level and high-level vision. Intermediate-level vision
has received less attention, not because it is unimportant
but because we needed to understand first what could
be extracted from images and what we could do with scene
objects if we were able to find them.

At the heart of the intermediate-level vision task is the
temporal difference between low-level and high-level vision. The
visual signal we receive and its features are transitory in nature.
They exist for a short time before they change; new
features appear and old ones disappear. However, the objects of
high-level vision have a continuity of existence. They exist
when they are not viewed. They exist when there are
no features in the current signal to expose them. Because
intermediate-level vision must map transitory image features
into scene entities that demonstrate this continuity of existence,
it is the output of intermediate-level vision, rather
than its input, that has temporal consistency. As a consequence,
intermediate-level vision, by its very nature,
must use previous output as input, as well as the features
of the signal, if it is to maintain temporal consistency. It
must supplement the results of signal processing with stored
knowledge of the world: knowledge of the nature of objects
in the world and knowledge of the constraints imposed on
objects by physical reality. Learning may provide this knowledge
in biological systems, but it must be otherwise
supplied in less accomplished systems.

An autonomous vehicle explores a world that has persistence.
Objects encountered exist when they are no longer
in view. Moving about in the world requires knowledge of
the environmental continuity. To permit autonomous movement,
a vision system must be able to map transitory image
features into persistent scene entities, the task that is assigned
to intermediate-level vision processes. Because an
autonomous vehicle must deal with continuity of existence,
we must address the problems of intermediate-level vision in
a way that has not been attempted in most previous vision
tasks.

In addition to their temporal differences, the various levels
of vision processing can be characterized in other ways.
Low-level vision processing usually requires a fixed set of
inputs on which to carry out its operations, and makes little
use of other information that is available. Because a
few image measurements do not characterize scene entities,
intermediate-level vision must have available many inputs
to mix and match as necessary. A flexible architecture is
required so that the results of other processing are available
for input to the intermediate-level vision process. Previous
results, such as confirmation of previous hypotheses, are the
inputs necessary for mapping signal to symbol.

Low-level vision processing is usually independent of the
task at hand. The zero-crossing operator is the same
whether we are looking for roads or houses. High-level vision
and intermediate-level vision are task dependent. To classify
the land cover of the terrain, we must know the task at
hand before we can determine the appropriate classification.
If we wish to determine whether the ground can support an
autonomous vehicle, we do not need to know the grass type.
However, if our task is to estimate wheel slippage, then details
like grass type are important. Task dependence has
always been a feature of high-level vision, but there we have
symbols to work with. The utility of intermediate-level vision
can be substantially increased if task dependence can
be moved below the symbolic level.

Intermediate-level vision must have flexible access to
many sources of information if it is to produce results that
are reliable and temporally consistent. The careful design of
an architecture to supply the various data is a prerequisite
to building an autonomous vision-based system. The objects
that intermediate-level vision deals with and the results it
produces are not the quantitative objects of low-level vision
nor the symbolic objects of high-level vision, but rather
the qualitative descriptors that interpolate from quantitative
signal to symbolic objects. Intermediate-level vision must integrate
the top-down approach of high-level vision with its
bottom-up, low-level counterpart. High-level models must
be rendered and matched against image data as symbolic
information is converted to iconic, while attributes of image
data must be identified and classified when iconic data are
mapped into symbolic. The knowledge system architecture
described here seeks to provide a base on which various and
varied approaches to machine vision may be explored.

2  Vision  System 

It is impracticable, with today's technology, to implement
the activities of a vision system in a sufficiently complex
monolithic algorithm that can cope with the irregularities
and imperfections of the outdoor world. For this reason, the
overall architecture of our vision system can be viewed as
a community of interacting processes, each of which has its
own limited goals and expertise, but all of which cooperate
to achieve the higher goals of the system. The various processes
may represent sensors, interpreters, controllers, user-interface
drivers, or any other information processor that
can be imagined. Each process can be both a producer of
information and a consumer. Information is shared among
processes by allowing them to read data stored by other processes
and to update that information. Each process continually
and asynchronously updates information based on
sensor readings, deductions, renderings, or other interpretations
that it makes.

Each process is a knowledge source that brings its expertise
to the processing of the data that represent the known
state of the world. These processes span the range from
low-level image processing to symbolic manipulation, and
their output will be available for use by all other knowledge
sources. Symbolic information may be used to set the
parameters in an image-processing procedure, while image
properties like texture may be used to confirm a deduced orientation
of a supporting surface. The type of information
that needs to be shared is enormously varied. The database
that stores this information must be able to accept this vast
assortment of data types and make it available to requesting
processes.

Our system includes a global database through which information
is shared. Because all processes share information,
the communication bandwidth between this database
and the various processes is of concern. If the granularity of
the information to be shared is too fine, then the communication
channels will be overloaded with an enormous number
of transactions, each of which involves small amounts
of data, while a granularity that is too coarse requires complex
knowledge sources that are beyond our ability to construct.
We view the knowledge sources as substantial entities
that attempt to share data objects that are composite
in nature. For example, we do not expect that an image-processing
routine would write intensity-edge information
into the database, but rather that it would share conclusions
about the three-dimensional objects that are in the world.
Of course, these three-dimensional objects will not be identified,
nor will they be the final partitioning of the scene into
world objects, but they will be entities with which other processes
can associate parameters and semantics. This does
not mean that the database contains only symbolic objects,
but rather that it contains objects that have some semantic
character, such as a horizontal planar surface with approximately
constant albedo. There are fewer transactions within
the system, but each is associated with a significant amount
of data.

Some knowledge sources may need to communicate with
others at a level that is not provided by the global database.
Such communication is private to those sources, and implementation
is the responsibility of the designers of those
processes. This level of information sharing often entails a
certain computational speed requirement and usually a processing
sequence that can be prespecified. Although any
system that interacts with a complex world may use this
form of close coupling between certain processes, we have
tried to focus on the problems of sharing information that
is of a higher level and is substantially unstructured.

If processes are to communicate through the database,
the language of communication must be rich enough to allow
items to be shared. Relevant information extracted from
the database is of little use if the receiving process cannot
understand it or make use of it. With the diversity of information
that is available, we choose to share that information
through semantic labels that classify the information in
the database. These labels must reflect the multiple levels
of specificity inherent in the information itself. The labels
form a vocabulary describing the information that is stored
in the database. Accessing information by means of the semantic
label allows processes to be independent of the particular
data syntax used to store the information. We allow
database access through logical combinations of the semantic
labels, and we allow procedural definitions to be passed to
the database so that a user may supplement the vocabulary
with additional terms. Passing procedural definitions
to the database also reduces the communication bandwidth
otherwise needed to return the results. Figure 1 shows this
view of transaction processing, in which the database has
been passed a request: the request parser
determines what must be retrieved from store, while post-processing
of the retrieved items occurs before information
is returned to the requesting process.

Figure 1: Transaction Processing. The request parser
determines what must be retrieved from store, while
post-processing of the retrieved items occurs before information
is returned. [Diagram labels: TRANSACTION, RESPONSE]
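
The access scheme described above can be illustrated with a minimal Python sketch. It is not the system's actual access language, which is not specified here; the class LabelStore and its methods insert, define, and query are hypothetical names chosen for illustration. The sketch shows a conjunction of semantic labels, a procedural definition that adds a new term to the vocabulary, and post-processing inside the store so that only the needed values are returned.

```python
from typing import Any, Callable, Dict, List, Optional, Set


class LabelStore:
    """Toy data store whose tokens are reached through semantic labels."""

    def __init__(self) -> None:
        self._tokens: Dict[int, Dict[str, Any]] = {}
        self._by_label: Dict[str, Set[int]] = {}
        # Procedural definitions supplied by user processes (hypothetical).
        self._defined: Dict[str, Callable[[Dict[str, Any]], bool]] = {}
        self._next_id = 0

    def insert(self, labels: Set[str], data: Dict[str, Any]) -> int:
        token_id = self._next_id
        self._next_id += 1
        self._tokens[token_id] = dict(data)
        for label in labels:
            self._by_label.setdefault(label, set()).add(token_id)
        return token_id

    def define(self, term: str, predicate: Callable[[Dict[str, Any]], bool]) -> None:
        """Supplement the vocabulary with a procedurally defined term."""
        self._defined[term] = predicate

    def query(self, all_of: List[str],
              post: Optional[Callable[[Dict[str, Any]], Any]] = None) -> List[Any]:
        """Return tokens matching every term; `post` runs inside the store."""
        ids: Optional[Set[int]] = None
        for term in all_of:
            if term in self._defined:
                pred = self._defined[term]
                matched = {i for i, t in self._tokens.items() if pred(t)}
            else:
                matched = self._by_label.get(term, set())
            ids = matched if ids is None else ids & matched
        results = [dict(self._tokens[i]) for i in sorted(ids or set())]
        return [post(r) for r in results] if post else results


store = LabelStore()
store.insert({"tree"}, {"height": 12.0})
store.insert({"tree"}, {"height": 3.0})
store.insert({"pole"}, {"height": 9.0})
store.define("tall", lambda t: t.get("height", 0) > 8)
heights = store.query(["tree", "tall"], post=lambda t: t["height"])
```

Because the post-processing function runs before anything is returned, only one number crosses the interface in this example rather than every tree token.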

A system that views processes as individual experts that
may make conflicting interpretations of the data must have
a policy to determine what is stored in the database. For example,
if two processes determine the height of a particular
tree to be substantially different, whose opinion should be
stored: the last one given, that of the process with more expertise,
or the average of the two? There is no "correct" way
to determine a single value. Traditionally, information integration
has been accomplished when the data are inserted
into the database, and the data that are then stored are expected
to be conflict-free. In our system, all processes are
considered equal, and only their opinions are stored. This
approach reflects the view that conclusions are a function
of the data used, the knowledge sources that provide that
data, and the anticipated use of the conclusion. The user of
information should have the opportunity to filter that information
with knowledge of both its content and its source.
Information in our data store can be modified only by the
process that created it, although other processes can cast
their opinions. To emphasize the contrast with conventional
databases, we therefore view our data store not as a database
but as an opinion base.
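
One way to picture an opinion base is the following Python sketch. It is an illustration under stated assumptions rather than the implemented design: the names Opinion, Slot, cast, and view are hypothetical, and the rule that a process may revise only its own opinion is our reading of the modification policy described above.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Set


@dataclass
class Opinion:
    source: str   # name of the process that cast the opinion
    value: Any


@dataclass
class Slot:
    """A data slot that keeps every process's opinion instead of one value."""
    opinions: List[Opinion] = field(default_factory=list)

    def cast(self, source: str, value: Any) -> None:
        """Add or revise the opinion of `source`; other opinions are untouched."""
        for op in self.opinions:
            if op.source == source:
                op.value = value
                return
        self.opinions.append(Opinion(source, value))

    def view(self, trusted: Optional[Set[str]] = None) -> List[Opinion]:
        """Let the consumer filter opinions by their source."""
        return [op for op in self.opinions
                if trusted is None or op.source in trusted]


height = Slot()
height.cast("stereo", 11.5)
height.cast("shadow_analysis", 14.0)
height.cast("stereo", 11.8)          # stereo revises its own opinion only
stereo_only = [op.value for op in height.view({"stereo"})]
```

No merging happens at insertion time; the consumer decides which sources to trust when it reads the slot.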

The opinion base stores information in the form of opinions
from the system processes about the domain of interest.
Another form of information that is critical to the performance
of the overall system is the knowledge used by the
various knowledge sources. Should that knowledge be stored
in a database (possibly the one used to store domain knowledge)
or should it be encoded within the knowledge sources?
In some processes, particularly low-level image processing,
the flow of control is well known, and efficiency issues require
that the knowledge be embedded in the procedures;
however, other processes, particularly goal-driven ones, gain
flexibility of control if the knowledge is separate from the
engine that applies that knowledge. Because we expect a
variety of processes to be used in intermediate-level vision,
we have selected a strategy in which there is no global repository
for the knowledge used by the various processes. Each
process is free to determine its own knowledge representation.
We have a future interest in having processes that
modify the ways in which other processes operate, perhaps
by generalizing the rules they use. We thus see a distinct
advantage in building processes in which the knowledge used
by each process is encoded in a form suitable for modification
by external processes.

Any system that consists of a collection of independent,
asynchronous processes must have a control mechanism that
coordinates these processes to achieve the system's goals. In
our system, each process is continually active, going about
its task of processing the data that define the current state
of the world and placing its opinions in the database. When
certain combinations of data occur, we must be able to interrupt
particular processes and have them deal with this
new information. We use a daemon approach to implement
this strategy. Daemons are placed in the database by the
processes that should be informed when particular events occur,
and the processes are responsible for determining how
to proceed when they are interrupted by these daemons.
Control by means of the database is therefore data driven.
Alternatively, any process is free to call procedures that are
embedded within another process, thus allowing control to
be passed by procedure call.
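
The daemon mechanism can be sketched in a few lines of Python. This is a simplified, synchronous illustration, not the system's implementation: DaemonStore, place_daemon, and the token fields "class" and "range_m" are hypothetical, and a real system with asynchronous processes would deliver an interrupt rather than call the handler directly.

```python
from typing import Any, Callable, Dict, List, Tuple

Condition = Callable[[Dict[str, Any]], bool]
Handler = Callable[[Dict[str, Any]], None]


class DaemonStore:
    """Toy store that notifies interested processes when matching data arrive."""

    def __init__(self) -> None:
        self._tokens: List[Dict[str, Any]] = []
        self._daemons: List[Tuple[Condition, Handler]] = []

    def place_daemon(self, condition: Condition, handler: Handler) -> None:
        """A process registers the event it wants to be informed about."""
        self._daemons.append((condition, handler))

    def insert(self, token: Dict[str, Any]) -> None:
        self._tokens.append(token)
        for condition, handler in self._daemons:
            if condition(token):
                handler(dict(token))   # the process receives a copy


alerts: List[Dict[str, Any]] = []
daemon_store = DaemonStore()
daemon_store.place_daemon(
    lambda t: t.get("class") == "obstacle" and t.get("range_m", float("inf")) < 10,
    alerts.append,
)
daemon_store.insert({"class": "obstacle", "range_m": 4.0})
daemon_store.insert({"class": "tree", "range_m": 50.0})
```

The handler decides how to proceed; the store only detects the event, which keeps control data driven.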

Control that is data driven is unlikely to be coordinated
to achieve the goals of the system if those goals are not
available to the various processes that are performing the
data transformations. An important part of sensory integration
is planning which activities will contribute to the
more general goals of the larger system in which the sensory
system is embedded. In our case, we interface with the
goals of a planning system that controls the activities of an
autonomous vehicle. A planning system is viewed simply
as another process or set of processes that may access the
database. The list of tasks that the vision system is attempting
to achieve serves as data that individual processes must
use to prioritize their own activities. Conclusions and data
transformations, no matter how correct or clever, are irrelevant
if they are unrelated to fulfilling the mission of the
highest-level system.

3  Database 

The database that we have designed to store the domain
data has many of the usual database features. It stores a
collection of data tokens that contain the domain knowledge
and has a set of indexing structures overlaid on these
tokens so that data manipulations based on the domain requirements,
such as data retrieval, may be implemented efficiently.
Unlike many vision-system databases, the database
has a continuity of life that exceeds a single execution of the
system. In this respect it is much more like a conventional
database, whose integrity and usefulness must persist over
an extended period. Data acquired during execution of the
system becomes knowledge stored in the database for future
use. To ensure that the internal integrity of the database is
maintained, processes do not have direct access to the data
tokens; instead, copies of the data are transferred between
the database and the process. Clearly, data copying is computationally
expensive, which is incompatible with real-time
performance. We therefore provide a mechanism in the data
access language that allows a process to pass a procedure to
the database so that internal processing can be used to minimize
the data transferred and the amount of copying that
is necessary.

The approach we adopt for controlling integrity is dictated
by a development environment in which the system
is not built by a single person or group but rather is a set
of processes provided by disparate implementers. Protecting
the data from being corrupted by an errant process is
critical if we want to avoid rolling back the database to a
previous version or editing it between actual uses. However,
the mechanism used to reduce data copying, sometimes at
the expense of integrity, is desirable for certain time-critical
processes if real-time performance is to be achieved.

Because all processes are considered equal and their opinions
are stored, the database will contain conflicting and incompatible
views of the state of the world. Some processes
may exist solely for the purpose of resolving such data inconsistencies.
Of course, even these processes will only be
allowed to cast an opinion. User processes may choose to
take more notice of the opinions of these conflict resolution
processes than of the opinions of processes whose conclusions
are drawn from less data. The conflict resolution processes
will continually process data in the database (as spare
computational resources allow), but they are conservative in
nature, preferring not to cast an opinion unless they have
overwhelming evidence to support their conclusion. However,
a user process may call one of these conflict resolvers
to cast an opinion even if it would not have otherwise intervened.
Our approach then is to allow inconsistencies to
be resolved whenever the data are sufficient to support the
resolution, or whenever a user process requires that resolution,
i.e. at access time. This approach differs from other
approaches that attempt to maintain a consistent data set;
in these approaches resolution must occur at insertion time.
The approach we adopt is to resolve if necessary, rather than
to resolve always. Often a decision-making process can take
action without the need to expend resources in resolving
data discrepancies. For example, the navigation module of
an autonomous land vehicle may be faced with the conflicting
data that the object ahead is either a tree or a telephone
pole. If the task is to move forward avoiding obstacles, the
vision system does not need to resolve whether the object
ahead is a tree or telephone pole. The resolution requirement
is a function of the task, not simply the data.
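
The "resolve if necessary" policy in the tree versus telephone pole example can be expressed as a small Python sketch. The function needs_resolution, the task names, and the set OBSTACLE_CLASSES are hypothetical and serve only to illustrate that the need for resolution depends on the task.

```python
from typing import List

# Hypothetical set of object classes that block the vehicle's path.
OBSTACLE_CLASSES = {"tree", "telephone pole", "rock"}


def needs_resolution(opinions: List[str], task: str) -> bool:
    """Return True only if the conflicting opinions matter for the task."""
    if len(set(opinions)) <= 1:
        return False                      # no conflict at all
    if task == "avoid_obstacles":
        # The conflict matters only if the opinions disagree about
        # whether the object is an obstacle.
        return len({cls in OBSTACLE_CLASSES for cls in opinions}) > 1
    return True                           # other tasks need the exact class


conflict = ["tree", "telephone pole"]
skip = needs_resolution(conflict, "avoid_obstacles")
must = needs_resolution(conflict, "identify_landmark")
```

Under these assumptions the navigation task proceeds without calling a conflict resolver, while a landmark identification task would request one at access time.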

A database that stores opinion will rapidly consume storage
resources unless a mechanism is provided that will allow
data to be deleted or at least archived. A process that is the
supplier of data may have little ability to evaluate the usefulness
of that data, yet it is the useful data that we would want
available in the database. The approach we have adopted is
to have processes sponsor data; that is, a process (probably
a process that uses a particular data token) will allow that
data token to be "charged" against its resource allocation.
Many processes can sponsor a single data token, and they are
charged proportionately. When a process nears its resource
limit (or at any time) it can withdraw its sponsorship of any
data that it has sponsored. Data that are unsponsored are
available for garbage collection (these data may be archived
or deleted). In this manner each process is responsible for
deciding what data it finds useful, and this collection of data
forms the base of current available information. Clearly, this
procedure is not fail safe. Critical data may be removed before
their criticality is realized. However, the criticality of
data is measured in terms of a process's willingness to pay
for it and presumably in terms of the current usefulness of
that data.
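
The sponsorship scheme can be sketched as a small ledger in Python. The sketch is illustrative only: SponsorLedger and its methods are hypothetical names, and "charged proportionately" is interpreted here as an equal split of a token's storage cost among its current sponsors, which is an assumption.

```python
from typing import Dict, List, Set


class SponsorLedger:
    """Tracks which processes sponsor which data tokens."""

    def __init__(self) -> None:
        self._sponsors: Dict[str, Set[str]] = {}   # token -> sponsoring processes
        self._size: Dict[str, float] = {}          # token -> storage cost

    def add_token(self, token: str, size: float) -> None:
        self._sponsors[token] = set()
        self._size[token] = size

    def sponsor(self, process: str, token: str) -> None:
        self._sponsors[token].add(process)

    def withdraw(self, process: str, token: str) -> None:
        self._sponsors[token].discard(process)

    def charge(self, process: str) -> float:
        """Storage charged to `process`; each token is split among its sponsors."""
        return sum(self._size[t] / len(s)
                   for t, s in self._sponsors.items() if process in s)

    def unsponsored(self) -> List[str]:
        """Tokens that are candidates for archiving or deletion."""
        return sorted(t for t, s in self._sponsors.items() if not s)


ledger = SponsorLedger()
ledger.add_token("tree-17", 90.0)
ledger.add_token("road-3", 40.0)
ledger.sponsor("navigator", "tree-17")
ledger.sponsor("landmarks", "tree-17")
ledger.sponsor("navigator", "road-3")
navigator_charge = ledger.charge("navigator")   # 90/2 + 40
ledger.withdraw("navigator", "road-3")
candidates = ledger.unsponsored()
```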

Even if a data token is unsponsored, it will not necessarily
be removed immediately. An information producer
may not wish to sponsor data for which it has little use, so
it may be some time before a sponsor for this information is
found. To avoid deleting useful data, the process whose job
is to remove data tokens evaluates additional information,
such as the length of time the token has been in the database, as
well as sponsorship information, before a token is removed. Data
removal is a continuous process, so that the database can be
assured of having adequate storage when time-critical tasks
demand that computational resources for garbage collection
be suspended.

Not every process in the system has the same resource
allocation. At particular times some processes may
be more valuable than others. One process has the task of
allocating database resources to the other processes. The allocation
is based on the frequency with which data tokens
produced by a process are consumed by another process.
Such a frequency measure is a moving statistic that allows
the allocation to adapt to the current situation. As is usual,
data tokens are time stamped to indicate the last time they
were modified (that is, the last time a new opinion was
added to one of the data slots), and they are time stamped
for last use. The time stamps provide data for the resource
allocator and the garbage collector.
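
As a hedged illustration of an allocation driven by a moving frequency statistic, the Python sketch below uses an exponentially weighted moving average of consumption counts and divides a storage budget in proportion to it. The text does not say which moving statistic is used, so the averaging rule, the smoothing factor alpha, and the function names update_rates and allocate are all assumptions.

```python
from typing import Dict


def update_rates(rates: Dict[str, float], consumed: Dict[str, int],
                 alpha: float = 0.5) -> Dict[str, float]:
    """Blend the previous rate with the latest per-producer consumption count."""
    producers = set(rates) | set(consumed)
    return {p: (1 - alpha) * rates.get(p, 0.0) + alpha * consumed.get(p, 0)
            for p in producers}


def allocate(rates: Dict[str, float], total: float) -> Dict[str, float]:
    """Split `total` storage among producers in proportion to their rates."""
    rate_sum = sum(rates.values())
    if rate_sum == 0:
        # No consumption observed yet: fall back to an equal split.
        return {p: total / len(rates) for p in rates}
    return {p: total * r / rate_sum for p, r in rates.items()}


# Tokens from "stereo" were consumed 12 times in the last interval,
# tokens from "texture" not at all.
rates = update_rates({"stereo": 4.0, "texture": 4.0},
                     {"stereo": 12, "texture": 0})
shares = allocate(rates, 100.0)
```

Because older observations decay, a producer whose tokens stop being consumed gradually loses its allocation, which is the adaptive behavior the text describes.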




Data tokens are produced by individual processes and are
passed to the database for storage and subsequent retrieval.
For the database to access information from within the token,
or for a requesting process to be able to extract information
from a token, each must either know the form of that
information or have some procedure for recovering it. In the
design of a system we can choose to use a standard structure
for a data token, such as a record structure in which the
positions of parameter slots are known, or we can use a standard
syntax for the token, such as a list of attribute-value
pairs, or we can by procedural attachment add functions
that retrieve values from the internal data structure of the
token.

With standard structures, position, rather than name,
gives us access to the data, but we require all processes to use
some predetermined set of structures. In a system in which
different processes do entirely different tasks, it is unlikely
that one could find, no matter how clever one is, a single
representation or a small set of representations that would be natural
for all the processes that must have access to
that data.

With both fixed syntax and procedural attachment to a
data structure, a vocabulary of terms is needed to access
the data slots. This is the approach we take. We use a
vocabulary of terms that spans the entities and relationships
of interest in the application domain. For an autonomous
land vehicle, the vocabulary consists of words or labels that
describe the outdoor environment, e.g. tree and height, so
a process could ask a data token that represented a tree
for that tree's height. The actual structure used to hold
the data can be invisible to the user, who gains access to
the information through the labels. The labels must be
known by all processes that wish to access this information
in the database. This semantic level does seem to be the
appropriate level on which to share information.

Should we use a fixed syntax like attribute-value pairs to
hold the information in the database and provide a simple
routine to retrieve the value given the attribute, or should
we use the more complex approach of attaching to a data
structure a set of functions that can retrieve the value of a
data slot given the slot name? We take the latter approach
to increase the functionality that is available when we retrieve
a value based on slot name. From the point of view of
systems building, in which parts of the system are built by
independent groups, this approach places the decisions for
the form of the data structure and the accessing functionality
within one group and provides a clean interface with
the database. Each process can now select its own internal
representations for the data it produces, and that data can
be shared through access functions that are based on terms
or labels in the vocabulary which describes the underlying
domain. A common vocabulary requires that each process
know how to translate from its internal representation to
information in vocabulary form. This avoids the need for
each process to know how to translate into the individual
representations used by other processes. Additionally, new
processes can be added to the system without retrofitting
the new representations to the older processes.
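
Procedural attachment can be sketched in Python as follows. The class DataToken, its get method, and the two example representations are hypothetical; the point is only that two producers with different internal structures answer to the same vocabulary labels ("class" and "height").

```python
from typing import Any, Callable, Dict


class DataToken:
    """Wraps a private representation and exposes it through vocabulary labels."""

    def __init__(self, internal: Any,
                 accessors: Dict[str, Callable[[Any], Any]]) -> None:
        self._internal = internal      # form chosen by the producing process
        self._accessors = accessors    # attached functions, keyed by label

    def get(self, label: str) -> Any:
        if label not in self._accessors:
            raise KeyError(f"token does not answer to label {label!r}")
        return self._accessors[label](self._internal)


# Producer A stores a tree as (base elevation, top elevation) in meters.
tree_a = DataToken((102.0, 114.5), {
    "class": lambda rec: "tree",
    "height": lambda rec: rec[1] - rec[0],
})

# Producer B stores a dictionary with the height in centimeters.
tree_b = DataToken({"kind": "tree", "height_cm": 950}, {
    "class": lambda rec: rec["kind"],
    "height": lambda rec: rec["height_cm"] / 100,
})

heights_m = [tree_a.get("height"), tree_b.get("height")]
```

A consumer needs to know only the labels; each producer translates from its own representation to vocabulary form, and a new producer can be added without changing existing consumers.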

A collection of data tokens is not a database unless there
is a means of accessing the information in the collection in
a manner that does not require a search through the entire
set. A set of indexing structures that allows access in
a more direct manner must be based on the subsets of the
data that need to be retrieved. These structures are therefore
based on the domain requirements and relate to the
semantics of the actual data stored. Our architecture for
sensory integration is implemented in the task domain of
autonomous land vehicle navigation. The indexing structures
that we use are associated with the need to retrieve
information that is appropriately grouped for the task of
navigation in the three-dimensional world. A spatial directory
that forms subsets of the data based on spatial location,
and a semantic directory that forms subsets of the data based
on object class, are the principal indexing schemes that we
use to organize storage and retrieval of data tokens. Figure
2 gives an overview of transaction processing by means of
directories in the database designed to support autonomous
vehicle navigation.

Figure 2: Transaction Processing (Autonomous Land Vehicle
Database). Access to the data store is by means of a
spatial directory, a semantic directory, or both.


4  Spatial  Directory 

The  spatial  directory  organizes  the  data  tokens  into  groups 
determined  by  spatial  location.  Because  an  autonomous  ve¬ 
hicle  may  roam  about  in  an  extensive  environment,  we  need 
a  representation  of  that  environment  that  can  deal  with  its 
spatial extent. In addition, the representation must be efficient
in indexing data when the data are distributed nonuniformly
over the environment. Data will need to be accessed
at  various  levels  of  resolution  depending  on  the  task  that 
is  being  addressed.  Route  planning  needs  lower-resolution 
data  than  does,  for  example,  landmark  identification  or  ob¬ 
stacle  avoidance.  Particular  data  may  need  to  be  stored  at 
multiple  levels  of  resolution  to  match  the  requirements  of 
different tasks. The world is three-dimensional but the vehicle
is restricted to a two-dimensional surface embedded in
this world. Although there are many reasons for choosing a
two-dimensional index, such as latitude and longitude, and
then representing the third dimension as a data value, we
chose to use a three-dimensional index. Our selection was
motivated  by  the  advantage  such  an  index  gives  in  encod¬ 
ing  spatial  relations  within  the  directory,  in  generating  vis¬ 
ibility  information,  and  in  using  this  architecture  in  other 
spatial  domains  in  which  movement  is  not  restricted  to  a 
two-dimensional  surface. 

The three-dimensional index selects a volume in space
that we represent as voxels [4]. The largest voxel is the
world,  which  is  subdivided  into  smaller  volumes  as  we  need 
to  represent  spatial  position  with  higher  precision.  The  in¬ 
dex  granularity  is  fine  enough  to  be  able  to  position  an  object 
in  a  volume  that  is  precise  enough  for  the  application.  Recall 
that  this  index  is  an  index  into  a  directory;  in  the  directory 
cell  are  pointers  to  the  data  tokens  associated  with  the  vol¬ 
ume  of  space  represented  by  this  index.  Data  tokens  need 
not  be  placed  in  the  directory  at  the  finest  index  available 
but  only  at  the  precision  with  which  their  spatial  location  is 
known.  A  tree  whose  position  is  unknown  would  be  placed 
in  the  largest  voxel;  this  voxel  represents  the  entire  world. 

The  voxel-based  directory  not  only  gives  a  range  of  posi¬ 
tion  resolutions,  it  also  allows  different  parts  of  the  world  to 
use different resolutions for storing data. Parts of the world
that have little data associated with them may choose to
place  all  the  pointers  to  data  tokens  representing  objects  in 
this  area  in  coarse-grained  volumes,  while  the  part  of  the 
world  in  which  the  vehicle  is  active  can  be  subdivided  into 
finely partitioned volumes. We not only have multiple resolutions,
but we can also select the resolution relevant to the area
concerned.

In  selecting  a  voxel-based  representation  of  space,  we  have 
the  option  of  dividing  that  space  into  regular  voxels  in  which 
all  voxels,  at  a  given  level  of  subdivision  of  the  space,  are 
of  equal  size,  or  we  can  choose  to  divide  the  space  into  ir¬ 
regularly  sized  chunks.  Irregularly  sized  voxels  have  some 
attractions,  as  they  allow  irregularly  shaped  objects  to  be 
confined,  and  hence  indexed,  within  a  volume  that  matches 
them.  Regularly  sized  voxels  often  are  unnecessarily  large 
when they are large enough to contain an irregularly shaped
object.  However,  if  we  use  irregularly  sized  voxels  we  may 
need  multiple  indices  to  allow  for  overlapping  voxels  that  are 
indexing different irregularly sized objects in the same volume
of space. Multiple indices increase the computational
load,  and  we  are,  after  all,  trying  to  index  data  tokens  in 
an  efficient  manner.  We  therefore  use  a  regular  subdivision 
of  space  in  which  each  voxel  is  subdivided  into  eight  equally 
sized  and  shaped  smaller  voxels. 
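
The regular eight-way subdivision can be illustrated with a small sketch. The voxel key `(level, ix, iy, iz)` and the function names below are assumptions made for this example, not the representation used in the implementation; level 0 stands for the world voxel.

```python
from typing import List, Tuple

# Assumed key: (level, ix, iy, iz); level 0 is the whole world.
Voxel = Tuple[int, int, int, int]


def children(v: Voxel) -> List[Voxel]:
    """Return the eight equally sized and shaped sub-voxels of v."""
    level, ix, iy, iz = v
    return [
        (level + 1, 2 * ix + dx, 2 * iy + dy, 2 * iz + dz)
        for dx in (0, 1)
        for dy in (0, 1)
        for dz in (0, 1)
    ]


def parent(v: Voxel) -> Voxel:
    """Return the voxel one level coarser that contains v."""
    level, ix, iy, iz = v
    if level == 0:
        raise ValueError("the world voxel has no parent")
    return (level - 1, ix // 2, iy // 2, iz // 2)


def edge_length(v: Voxel, world_size: float) -> float:
    """Edge of a voxel, halving at every level of subdivision."""
    return world_size / (2 ** v[0])
```

Because every voxel at a given level has the same size, a position maps to exactly one voxel per level, which is what makes a single index sufficient.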

In making this choice we must address the problem of indexing
objects whose shape does not match this partitioning
of space. Generally, it is easy to place stationary compact
objects within a voxel that can completely contain them, but
objects like linear structures, surfaces, and moving objects
require alternative approaches. Linear structures like roads,
rivers, telephone wires, and fences are stored as one data token,
but pointers are placed in all the voxels through which
the structure passes. We use the smallest-sized voxels that
are appropriate; for example, the voxel size for a road will
be determined by the road width so that we can be assured
that the road “fits” within the voxel.
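
As a rough sketch of how one token can be referenced from every voxel along a linear structure, the example below samples points along a polyline and records the voxel containing each sample. The sampling scheme, the function names, the token identifier `road-17`, and the numeric values are all assumptions for illustration; dense sampling is an approximation that can miss a voxel the line only clips at a corner.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int, int]  # assumed key: (level, ix, iy, iz)


def level_for_width(width: float, world_size: float, max_level: int) -> int:
    """Finest level whose voxel edge is still at least `width`."""
    level = int(math.floor(math.log2(world_size / width)))
    return max(0, min(level, max_level))


def voxel_at(p: Point, level: int, world_size: float) -> Voxel:
    """Voxel at the given level that contains point p."""
    n = 2 ** level
    edge = world_size / n
    ix, iy, iz = (min(n - 1, max(0, int(c // edge))) for c in p)
    return (level, ix, iy, iz)


def voxels_along(path: List[Point], level: int, world_size: float) -> Set[Voxel]:
    """Approximate the voxels a polyline passes through by dense sampling."""
    edge = world_size / (2 ** level)
    hit: Set[Voxel] = set()
    for a, b in zip(path, path[1:]):
        steps = max(1, int(math.ceil(math.dist(a, b) / (edge / 4))))
        for i in range(steps + 1):
            t = i / steps
            p = (a[0] + t * (b[0] - a[0]),
                 a[1] + t * (b[1] - a[1]),
                 a[2] + t * (b[2] - a[2]))
            hit.add(voxel_at(p, level, world_size))
    return hit


# One token, many directory cells pointing at it.
directory: Dict[Voxel, List[str]] = {}
road_path = [(10.0, 10.0, 0.0), (200.0, 10.0, 0.0)]
road_level = level_for_width(8.0, 1024.0, max_level=10)
for v in voxels_along(road_path, road_level, 1024.0):
    directory.setdefault(v, []).append("road-17")
```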

The same approach is taken with other extended objects,
such as surfaces: a single data token has pointers to it from
the set of voxels through which the surface passes. The size
of the voxel is selected by the process inserting the surface
into the database, based on such factors as accuracy of the
surface shape, and extent. Recall that this placement in
space is to aid retrieval, not to specify exactly where things
are. Detailed location information is available from within
the data token. There is no need to place objects in the
spatial directory in the smallest voxel that might be possible.

Moving objects are usually compact objects so they
present little problem in placement at their current position,
but there may be times when we want their track represented
in the directory. We use the same approach we used
for linear structures and extended objects; we represent the
moving objects with a single data token and point to the
token from voxels associated with its track.

An advantage of a multiresolution spatial directory is the
ease with which we can represent approximate location. We
place an object in a voxel that is large enough to contain the
limits of its possible locations. Object location may be approximate
because of image processing errors when detecting
objects in imagery, or because we do not know our exact
position when we make an observation. The latter is particularly
relevant in the case of an autonomous vehicle. Data
can  be  added  to  the  database  before  its  position  is  known, 
and  then,  when  better  location  information  is  known,  the 
directory  can  be  updated  by  moving  the  data  to  a  smaller 
volume.  If  this  is  not  done  the  data  will  be  retrieved  and 
examined  when  requests  are  processed  for  data  from  the 
original larger voxel. A background process whose task is
to  move  objects  to  their  most  precise  location  within  the 
directory  (when  processing  resources  are  available)  accom¬ 
plishes  the  directory  update  and  thereby  achieves  retrieval 
efficiency.  Hence  all  data  can  be  directly  inserted  into  one 
directory  whether  their  location  is  known  accurately  or  only 
approximately. 
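
A minimal sketch of approximate placement and later refinement follows. It assumes that location uncertainty is expressed as an axis-aligned bounding box, which the text does not specify, and the names `enclosing_voxel` and `SpatialDirectory` are hypothetical. A box touching a voxel boundary is treated conservatively as straddling it.

```python
from typing import Dict, Optional, Set, Tuple

Voxel = Tuple[int, int, int, int]  # assumed key: (level, ix, iy, iz)
Corner = Tuple[float, float, float]
Box = Tuple[Corner, Corner]        # (min corner, max corner) of possible locations


def enclosing_voxel(box: Box, world_size: float, max_level: int) -> Voxel:
    """Smallest voxel, down to max_level, that contains the whole box."""
    lo, hi = box
    best: Voxel = (0, 0, 0, 0)
    for level in range(1, max_level + 1):
        edge = world_size / (2 ** level)
        lo_idx = tuple(int(c // edge) for c in lo)
        hi_idx = tuple(int(c // edge) for c in hi)
        if lo_idx != hi_idx:
            break  # the box straddles a voxel boundary at this level
        best = (level, lo_idx[0], lo_idx[1], lo_idx[2])
    return best


class SpatialDirectory:
    def __init__(self, world_size: float, max_level: int) -> None:
        self.world_size = world_size
        self.max_level = max_level
        self.cells: Dict[Voxel, Set[str]] = {}
        self.where: Dict[str, Voxel] = {}

    def place(self, token_id: str, box: Box) -> Voxel:
        """Insert a token, or move it when a better estimate arrives."""
        new = enclosing_voxel(box, self.world_size, self.max_level)
        old: Optional[Voxel] = self.where.get(token_id)
        if old is not None and old != new:
            self.cells[old].discard(token_id)
            if not self.cells[old]:
                del self.cells[old]  # keep only occupied cells
        self.cells.setdefault(new, set()).add(token_id)
        self.where[token_id] = new
        return new


d = SpatialDirectory(1024.0, 8)
# Position unknown: the possible locations span the whole world.
first = d.place("tree-5", ((0.0, 0.0, 0.0), (1024.0, 1024.0, 1024.0)))
# A later observation narrows the estimate; the same call refines the entry.
second = d.place("tree-5", ((100.0, 100.0, 0.0), (110.0, 110.0, 5.0)))
```

In this sketch a background task would simply call `place` again for tokens whose estimates have improved.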

Having all data, whether its position is known or uncertain,
within one directory structure allows us to respond
easily to data retrieval requests that want “all objects that
are within a certain volume in space” as well as “all objects
that are possibly within that particular volume of space.”
Clearly, in the task domain of an autonomous land vehicle,
knowing what might be ahead and what is ahead is necessary
for competent navigation and obstacle avoidance. Within
the voxel structure, “within a volume” maps to the tree of
voxels below (finer than) the voxel containing the volume
while “possibly within a volume” maps to the tree above
(coarser than) the voxel containing the volume. When data
can be retrieved on the basis of their location, then retrievals
on the basis of spatial relations are also possible.
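
The two kinds of request can be sketched as follows, assuming the directory is a mapping from voxel keys `(level, ix, iy, iz)` to sets of token identifiers. The function names and sample tokens are hypothetical. Here `within` includes the query voxel itself, and `possibly_within` returns only the tokens held in coarser voxels, so a caller wanting everything that may be in the volume would take the union of the two.

```python
from typing import Dict, Iterator, Set, Tuple

Voxel = Tuple[int, int, int, int]  # assumed key: (level, ix, iy, iz)


def ancestors(v: Voxel) -> Iterator[Voxel]:
    """Yield every coarser voxel that contains v, ending with the world."""
    level, ix, iy, iz = v
    while level > 0:
        level, ix, iy, iz = level - 1, ix // 2, iy // 2, iz // 2
        yield (level, ix, iy, iz)


def is_descendant_or_self(v: Voxel, root: Voxel) -> bool:
    """True when v is root or lies in the tree of voxels below root."""
    shift = v[0] - root[0]
    if shift < 0:
        return False
    return (v[1] >> shift, v[2] >> shift, v[3] >> shift) == root[1:]


def within(directory: Dict[Voxel, Set[str]], query: Voxel) -> Set[str]:
    """Tokens known to lie inside the query voxel (tree below it)."""
    found: Set[str] = set()
    # A linear scan keeps the sketch short; it is not meant to be efficient.
    for v, tokens in directory.items():
        if is_descendant_or_self(v, query):
            found |= tokens
    return found


def possibly_within(directory: Dict[Voxel, Set[str]], query: Voxel) -> Set[str]:
    """Tokens placed in coarser voxels, which may or may not be inside."""
    found: Set[str] = set()
    for v in ancestors(query):
        found |= directory.get(v, set())
    return found


sample = {
    (0, 0, 0, 0): {"tree-unknown"},
    (2, 1, 1, 0): {"rock-3"},
    (3, 2, 3, 0): {"ditch-9"},
    (3, 7, 7, 7): {"far-object"},
}
```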

The spatial directory encodes the spatial relationships between
items stored in the database. As objects are moved
or their spatial positions refined, these spatial relations are
maintained without additional processing resources. New
objects entered into the database encode their spatial relationship
with previously entered data. In our task domain,
we expect to retrieve items based on relative position: objects
to the right of the road, trees casting shadows on the
road, and so on. Having an indexing structure that matches
the world structure allows this without the overhead that
would be presented by alternative schemes, such as a relational
database.

Figure 3: Octree Representation of the Voxel Description of
Space. Hash tables are used to implement the octree; more than
one level of the octree is stored in a single hash table.

The  reduction  of  computational  resources  used  to  main¬ 
tain  the  database  was  also  instrumental  in  our  treatment  of 
time. The database is always assumed to represent the world
at current time. If historical information is to be stored,
then it must be time-stamped, otherwise it is implied that
the data reflect the state of the world as it currently is. We
adopted this approach so that we could avoid elements of
the traditional frame problem [1]: if time is a parameter
of  the  data  token,  then  this  token  has  to  be  updated  even 
when  the  real  data  has  not  changed  but  time  has  passed.  We 
take  the  usual  approach  adopted  in  conventional  databases, 
in  that  information  is  assumed  to  be  still  true  if  it  has  not 
been  altered  or  specifically  marked  as  applying  only  to  some 
particular  interval  of  time. 

Voxels  are  the  representation  of  the  world  used  in  the  spa¬ 
tial  directory,  but  there  is  the  independent  issue  of  how  we 
represent voxels in our implementation of the spatial directory.
We use a “pointerless” octree [3] that itself is implemented
by multiple hash tables. The use of an octree to
implement  a  voxel  representation  is  natural;  our  selection 
of  the  pointerless  approach  was  based  on  the  expectation 
that  many  voxels  will  contain  no  data,  and  many  voxels  will 
not be subdivided into smaller units. Hence the more usual
approach  of  using  cells  with  explicit  pointers  to  the  finer 
cells  will  produce  many  cells  containing  mainly  null  point¬ 
ers.  With  the  pointerless  approach,  only  voxels  that  contain 
data  tokens  are  allocated  any  storage,  and  null  pointers  are 
not  used.  Figure  3  shows  an  abstract  view  (using  null  point¬ 
ers)  of  the  way  we  use  an  octree  to  represent  the  voxel  de¬ 
scription  of  the  world.  The  actual  implementation  uses  hash 
tables to store the links between voxels. The number of levels
of hash tables is in fact somewhat less than the number
of octree levels, because several octree levels are stored in a
single hash table, as shown in Figure 3.
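
One way a pointerless octree might be laid out over hash tables is sketched below. Everything specific in it is an assumption: the interleaved-bit (Morton) key, the class name, and the grouping of three octree levels per table are chosen for illustration, since the text gives neither the key format nor the grouping. Python dictionaries stand in for the hash tables.

```python
from typing import Dict, List, Set, Tuple

LEVELS_PER_TABLE = 3  # assumed grouping; the text does not give a number


def morton(ix: int, iy: int, iz: int, level: int) -> int:
    """Interleave the low `level` bits of the three indices into one code."""
    code = 0
    for bit in range(level):
        code |= ((ix >> bit) & 1) << (3 * bit)
        code |= ((iy >> bit) & 1) << (3 * bit + 1)
        code |= ((iz >> bit) & 1) << (3 * bit + 2)
    return code


class PointerlessOctree:
    """Only voxels that hold data tokens get an entry; no null pointers."""

    def __init__(self, max_level: int) -> None:
        n_tables = max_level // LEVELS_PER_TABLE + 1
        self.tables: List[Dict[Tuple[int, int], Set[str]]] = [
            {} for _ in range(n_tables)
        ]

    def _table(self, level: int) -> Dict[Tuple[int, int], Set[str]]:
        # Several consecutive octree levels share one hash table.
        return self.tables[level // LEVELS_PER_TABLE]

    def insert(self, level: int, ix: int, iy: int, iz: int, token_id: str) -> None:
        key = (level, morton(ix, iy, iz, level))
        self._table(level).setdefault(key, set()).add(token_id)

    def lookup(self, level: int, ix: int, iy: int, iz: int) -> Set[str]:
        key = (level, morton(ix, iy, iz, level))
        return set(self._table(level).get(key, ()))

    def allocated(self) -> int:
        """Number of voxels that actually occupy storage."""
        return sum(len(t) for t in self.tables)


octree = PointerlessOctree(max_level=8)
octree.insert(4, 5, 3, 1, "rock-3")
```

With this key the parent of a voxel is found arithmetically, by dropping the last three bits of the code, so no parent or child pointers need to be stored.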

5  Semantic  Directory 

The  spatial  directory  provides  an  indexing  scheme  that 
matches  the  spatial  nature  of  the  data  in  the  task  domain; 
the semantic directory provides an indexing scheme that
matches  the  semantic  nature  of  the  data  in  that  domain. 
As  previously  mentioned,  we  use  a  vocabulary  of  terms  to 
facilitate  communication  between  processes.  The  semantic 
directory  specifies  these  terms  and  defines  the  set  of  connec¬ 
tions  between  them.  The  vocabulary  provides  a  set  of  labels 
that  is  used  to  describe  the  data  tokens  in  the  database. 
Such  a  set  is  dependent  on  the  task  domain,  and  for  au¬ 
tonomous  land  vehicles  we  use  terms  that  label  objects  in 
the  outdoor  environment,  such  as  tree,  road,  rock,  meadow, 
or ditch, as well as terms with less specificity, such as
immovable_object, obstacle, or object.

The  need  for  terms  that  define  the  semantics  of  things  in 
the  world  at  various  levels  of  abstraction  or  multiple  levels 
of resolution is apparent if we wish to interpret imagery as
seen  from  a  moving  vehicle:  objects  usually  appear  first  at 
a  distance,  at  poor  resolution,  and  gradually  change  form  as 
we  approach  them.  The  levels  of  abstraction  that  we  need 
are  a  function  of  the  processes  we  have  and  their  ability 
to  instantiate  the  terms.  There  is  no  point  in  being  able 
to  describe  leaves  on  a  tree  if  the  sensors  are  incapable  of 
resolving  objects  that  small.  Equally  there  is  no  point  in 
describing trees as belonging to the superset “wooden objects”
if no process makes use of that set. The vocabulary
choice that we have made is based on our assessment of the
competence of low-level image processing routines and the
requirements of higher-level processes. The choice is critical
to sensory integration, for within the vocabulary we are
restricting the means of integration, the information that
higher-level processes can transfer to the low-level routines
(and vice versa), and the functionality requirements of both
higher and lower-level processes. In absolute terms, successful
sensor integration demands selection of an appropriate
vocabulary.

Figure 4: Semantic Directory. Implemented as a semantic
network, it gives access to the database data tokens by means of
their semantic type.

Any vocabulary whose constituent terms span a wide
range of specificity in a domain must include terms that are
related to one another. The second component of the semantic
directory, a semantic network [1], defines these connections.
The network itself has two parts: one which defines
the specialization of terms by a graph, that is, a lattice that
specifies subset/superset relations and that is augmented by
the inclusion of the disjoint set relation, and a second part
that describes the decomposition of composite objects into
parts. While the first part indicates relationships that must
hold, such as “a pine tree is a tree,” the second decomposes
composite objects into parts that are usually present, such
as “fire engines usually have ladders.” The first part of the
network is used for inference; for example, in inferring that a
pine_tree is a tree, which is an immovable object, which is an
object, and so on. The second part gives default values that
may be used to trigger some process to find them, or may be
used by an evidential reasoning process that is attempting,
say, to classify an object based on what has been detected
and what one might expect to see when viewing that particular
object. For example, when a process is attempting
to decide whether an object that is composed of several vertical
rectangular objects and some horizontal lines could be
a portion of a fence, knowledge of the expected parts of a
fence is crucial to that determination. Additionally, the network
provides a means for inheriting properties from a more
general class; e.g., if a tree is usually composed of branches,
leaves, and a trunk, then a subclass, like pine_trees, will
inherit this parts decomposition as its default description.
The approach taken reflects the need, on one hand, for the
system to reason about objects, while, on the other hand,
the system must be able to recognize composite objects on
the basis of their likely parts. A mechanism for logical inference
and a mechanism for object decomposition that is
usual but not unequivocal must therefore be provided.
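
The two parts of the network can be illustrated with a small sketch: one set of arcs that must hold (subset/superset and disjoint) and one set of default part lists that subclasses inherit. The class and method names are hypothetical, and the term `vehicle` and its disjointness from `immovable_object` are assumed here only to exercise the disjoint relation.

```python
from collections import deque
from typing import Deque, Dict, FrozenSet, List, Set


class SemanticNetwork:
    def __init__(self) -> None:
        self.supersets: Dict[str, Set[str]] = {}   # term -> direct supersets
        self.disjoint: Set[FrozenSet[str]] = set()  # unordered disjoint pairs
        self.parts: Dict[str, List[str]] = {}       # term -> usual parts

    def add_subset(self, term: str, superset: str) -> None:
        self.supersets.setdefault(term, set()).add(superset)

    def add_disjoint(self, a: str, b: str) -> None:
        self.disjoint.add(frozenset((a, b)))

    def add_parts(self, term: str, parts: List[str]) -> None:
        self.parts[term] = list(parts)

    def generalizations(self, term: str) -> List[str]:
        """The term and every superset reachable from it, nearest first."""
        order = [term]
        queue: Deque[str] = deque([term])
        while queue:
            for sup in sorted(self.supersets.get(queue.popleft(), ())):
                if sup not in order:
                    order.append(sup)
                    queue.append(sup)
        return order

    def is_a(self, term: str, candidate: str) -> bool:
        """Logical inference over the subset/superset arcs."""
        return candidate in self.generalizations(term)

    def are_disjoint(self, a: str, b: str) -> bool:
        """Disjointness is inherited from any pair of generalizations."""
        return any(
            frozenset((x, y)) in self.disjoint
            for x in self.generalizations(a)
            for y in self.generalizations(b)
        )

    def default_parts(self, term: str) -> List[str]:
        """Usual parts, inherited from the nearest class that lists any."""
        for t in self.generalizations(term):
            if t in self.parts:
                return self.parts[t]
        return []


net = SemanticNetwork()
net.add_subset("pine_tree", "tree")
net.add_subset("tree", "immovable_object")
net.add_subset("immovable_object", "object")
net.add_subset("vehicle", "object")                # assumed term
net.add_disjoint("immovable_object", "vehicle")    # assumed relation
net.add_parts("tree", ["branches", "leaves", "trunk"])
```

The part list is a default, not a requirement: a process may use `default_parts` to decide what to look for, but the absence of a part does not contradict the classification.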

The  semantic  network  we  use  is  implemented  as  a  graph 
in  which  the  nodes  represent  the  vocabulary  items  and  the 
labeled arcs represent the relationships among terms. Both
the subset/superset and disjoint set relations, together with
the composite object decomposition, are combined on the
one  graph  using  various  labels  on  the  arcs  to  distinguish 
between them. For example, the lattice fragment

    [ A ] -----> [ B ]

encodes the sentence $\forall x : (A(x) \Rightarrow B(x))$, while

    [ C ] --Disjoint-- [ D ]

encodes $\neg \exists x : (C(x) \wedge D(x))$. This network representation
allows  selected  inferences  to  be  made  rapidly  through  graph 
operations.  The  particular  implementation  allows  display  of 
the  semantic  network  in  its  entirety  or  of  selected  clusters 
of  related  information.  Figure  4  shows  a  small  part  of  our 
semantic network. The graphical display of the network is
the interface we use to build the semantic directory and to
add  new  words  and  relations  to  our  vocabulary. 

Each node of the semantic network is associated with a
vocabulary term, and to it we attach pointers to all the data
tokens  in  the  database  that  have  been  labeled  with  this  term. 
The  nodes  of  the  semantic  network  can  be  accessed  by  the 
vocabulary  label,  and  thus  provide  a  directory  to  data  tokens 
on  the  basis  of  the  semantic  label. 

Although we view the semantic directory as a graph structure
and display the semantic network as a graph, the implementation
uses hash tables for speed of access. When data
tokens  are  added  to  the  database  or  when  additional  labels 
are  added  to  a  token’s  description,  the  semantic  directory  is 
updated  appropriately. 

Data tokens are attached to the most specific network
nodes possible. If, for example, a data token had been
labeled by a process as being a paved_road, then it is attached
only to the semantic network node for paved_road
even though all paved roads are known to be roadways. This
approach was adopted to save storage as well as to provide
a straightforward implementation of the retrieval request
to return all objects that are paved_roads as opposed to all
objects that might be paved_roads. The second descriptor
includes objects in the more general class “roadways” as well
as those labeled “paved_roads.” Paved_roads are found attached
to the nodes of the lattice that form the tree rooted at
the node labeled paved_road, whereas roadways that might
be paved_roads are found attached to the nodes of the network
tree above the node labeled paved_road. This arrangement
parallels the mechanisms used in the spatial directory
to find objects that are at a particular location, as opposed
to those that might be at that location. It is the responsibility
of the access routines to retrieve the appropriate items
from the database by means of the semantic network.
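
The difference between the two retrieval requests can be sketched as a pair of graph walks. The class `SemanticDirectory`, its method names, the term `highway`, and the token identifiers are assumptions for this example. Tokens are attached only at their most specific label, so "are" walks down from the term and "might be" additionally walks up.

```python
from collections import defaultdict
from typing import Dict, Set


class SemanticDirectory:
    def __init__(self) -> None:
        self.supers: Dict[str, Set[str]] = defaultdict(set)
        self.subs: Dict[str, Set[str]] = defaultdict(set)
        self.tokens: Dict[str, Set[str]] = defaultdict(set)

    def add_subset(self, term: str, superset: str) -> None:
        self.supers[term].add(superset)
        self.subs[superset].add(term)

    def attach(self, token_id: str, term: str) -> None:
        # Attach only at the most specific label the producer can justify.
        self.tokens[term].add(token_id)

    @staticmethod
    def _reachable(start: str, edges: Dict[str, Set[str]]) -> Set[str]:
        seen: Set[str] = set()
        stack = [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def are(self, term: str) -> Set[str]:
        """Tokens at the term or at any of its specializations."""
        terms = {term} | self._reachable(term, self.subs)
        return set().union(*(self.tokens.get(t, set()) for t in terms))

    def might_be(self, term: str) -> Set[str]:
        """Tokens that are the term, plus those labeled only more generally."""
        above = self._reachable(term, self.supers)
        maybe = set().union(*(self.tokens.get(t, set()) for t in above))
        return self.are(term) | maybe


sd = SemanticDirectory()
sd.add_subset("roadway", "object")
sd.add_subset("paved_road", "roadway")
sd.add_subset("highway", "paved_road")   # assumed specialization
sd.add_subset("tree", "object")
sd.attach("t1", "paved_road")
sd.attach("t2", "roadway")
sd.attach("t3", "highway")
sd.attach("t4", "tree")
```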

The semantic network serves partly as a definition of the
meaning of concepts. If a process designer wishes to know
what questions he can ask of a data token that is, for example,
a tree, the network specifies the relevant terms, such as
height, or color. The semantic network defines more than
just the communication language between processes; it defines
something of the domain concepts that all processes
must use. However, while the concept “tree,” for example,
may be seen in the semantic network to include pine_trees,
and oak_trees, and so on, and while a tree is an
immovable_object and an object, and while it has parts (and
properties) of height, and color, it is not “defined” by the
network. The network does not define for a process the concept
“tree;” it specifies only the concepts that processes can use
to communicate about a tree. A particular process may
determine that an object is a pine_tree on the basis of its
temperature and the soil type around it, but it must share
its information in terms of the concepts defined in the network.
This approach was adopted for important pragmatic
reasons: it is impossible to “define” a concept like a tree,
yet we need to communicate information about a tree in
terms that other processes understand.

6  Other  Directories 

The system architecture we have described is independent of
the indexing structures that are overlaid on the database; to
change those structures requires only changes to the parser
that processes database requests (as can be seen in Figure
2). The extensibility of the directory system allows future
requirements to be accommodated without change to the
overall system structure. The two directories we describe
were devised to allow an autonomous land vehicle to navigate
through a world in which most objects are static and
motion comes primarily from the movement of the vehicle
itself. In other scenarios, this will be inadequate. In
environments in which there are many moving objects, and
fast-moving objects that are likely to impact the mission
results, other directories that index the database through
additional parameters, such as those associated with movement,
are vital. The architecture described has the flexibility to
accommodate such extensions.

7  Process  Control 

We have described the various processes that form the system
as independent, asynchronous processes that can be activated
by means of daemons embedded in the database or
by  more  conventional  procedure  calls.  Each  method  uses  vo¬ 
cabulary  terms  to  interact  with  the  database.  Each  process 
is  continuously  executing,  although  a  process  may  put  itself 
to sleep only to be awakened when predetermined data conditions
exist. Who determines these conditions? Should every
process be permitted to determine the conditions needed
to interrupt another process? Some processes may be
time-critical and prefer not to be interrupted. Our approach is
to  require  that  the  process  itself  set  these  conditions  within 
the  database.  Any  process  can  attach  one  of  its  daemons  to 
any  data  slot  of  any  data  token,  so  that  the  process  will  be 
interrupted  whenever  any  new  or  changed  opinion  modifies 
that data slot. We selected data slots rather than data tokens
as the items on which to attach daemons because data
tokens  usually  represent  a  complex  item  and  any  one  pro¬ 
cess  is  probably  interested  in  only  some  aspects  of  it:  for 
example,  the  navigational  module  of  an  autonomous  vehicle 
will  want  to  be  interrupted  if  a  sensor  process  gives  a  new 
opinion  on  the  position  of  an  obstacle,  but  it  is  unlikely  to 
need  to  be  interrupted  if  the  obstacle’s  color  changes.  It  is, 
therefore, the responsibility of a process to determine when
it  is  to  be  interrupted. 
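
A slot-level daemon can be sketched as a callback registered against a (token, slot) pair. This is an illustration only: the names `Database`, `attach_daemon`, and `post_opinion` are hypothetical, and a synchronous function call stands in for interrupting an asynchronous process.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# A daemon receives (token_id, slot, new_value).
Daemon = Callable[[str, str, Any], None]
SlotKey = Tuple[str, str]


class Database:
    def __init__(self) -> None:
        self.slots: Dict[SlotKey, Any] = {}
        self.daemons: Dict[SlotKey, List[Daemon]] = defaultdict(list)

    def attach_daemon(self, token_id: str, slot: str, daemon: Daemon) -> None:
        """A process registers its own daemon on the slot it cares about."""
        self.daemons[(token_id, slot)].append(daemon)

    def post_opinion(self, token_id: str, slot: str, value: Any) -> None:
        """Store an opinion; fire daemons only if the slot is new or changed."""
        key = (token_id, slot)
        changed = key not in self.slots or self.slots[key] != value
        self.slots[key] = value
        if changed:
            for daemon in list(self.daemons.get(key, ())):
                daemon(token_id, slot, value)


db = Database()
interrupts: List[Tuple[str, str, Any]] = []

# The navigation module watches the obstacle's position, not its color.
db.attach_daemon("obstacle-1", "position",
                 lambda t, s, v: interrupts.append((t, s, v)))

db.post_opinion("obstacle-1", "position", (3.0, 4.0))   # fires
db.post_opinion("obstacle-1", "color", "red")           # no daemon attached
db.post_opinion("obstacle-1", "position", (3.0, 4.0))   # unchanged, no fire
db.post_opinion("obstacle-1", "position", (3.5, 4.0))   # fires
```

What the process does on being notified is left to its own handler, which in this sketch merely records the event.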

In  a  like  manner,  it  is  the  process  that  determines  what 
action  to  take  when  it  is  interrupted.  The  interrupt  han¬ 
dler  is  part  of  the  definition  of  each  process.  As  processes 
are quite varied, there is no sense to the notion of a generic
interrupt  handler.  Clearly,  processes  may  choose  to  con¬ 
tinue  with  what  they  are  doing  rather  than  to  process  the 
interrupt if they assess the current task to be more relevant
to  mission  success  than  that  associated  with  the  interrupt. 
Conversely,  a  process  may  instead  suspend  or  abandon  what 
it  is  doing  in  favor  of  the  interrupt.  The  overall  system  con¬ 
cept  is  that  of  a  loosely  coupled  system  in  which  all  processes 
work on their goals cognizant of the overall goals of the mission.
Each process determines how it can best support the
mission goals and is responsible for the means to achieve
this. 

The process architecture we use parallels that of blackboard
systems that were brought to prominence in the building
of speech understanding systems [2]. In these systems
data  were  placed  on  a  blackboard;  if  the  combination  of 
data  on  the  blackboard  met  the  preconditions  for  a  particu¬ 
lar  procedure  to  execute,  then  that  procedure  was  triggered 
and  put  on  the  schedule  for  computing  resources.  In  an  im¬ 
portant  way,  the  approach  of  activating  processes  using  dae¬ 
mons  differs  from  the  triggering  mechanism  used  on  black¬ 
board  systems.  We  do  not  have  a  pattern  matcher  whose  job 
is  to  trigger  processes  when  a  particular  pattern  of  data  ap¬ 
pears  in  the  database  (or  on  the  blackboard).  For  efficiency 
reasons,  the  patterns  that  pattern  matchers  are  to  recognize 
must  be  predetermined  and  compiled  in  at  system  building 
time.  In  a  system  that  is  loosely  coupled,  in  which  differ¬ 
ent  processes  may  be  present  during  different  executions  of 
the  system  —  in  which  the  system  must  function  even  if 
some  of  the  processes  (or  hardware)  fail  —  an  approach  to 
pattern  matching  that  decentralizes  the  responsibility  for  de¬ 
termining  whether  a  process  should  be  triggered  seems  more 
manageable. We have chosen to trigger on an opinion being
changed  rather  than  on  a  particular  pattern  in  the  data  it¬ 
self.  In  selecting  this  mechanism,  we  weighed  the  cost  of  the 
additional  processing  that  is  done  by  the  interrupt  handler 
in  each  process  against  the  computational  cost  of  running  a 
generalized  pattern  matcher. 

In  any  system  that  is  a  collection  of  processes,  priority 
will  sometimes  need  to  be  given  to  processes  that  perform 
time-critical  tasks.  At  other  times,  the  system  could  be 
underutilized. As a result some processes should be scheduled
as foreground jobs, which compete for resources when
they  request  them,  while  others  should  be  background  pro¬ 
cesses  using  only  spare  resources.  We  identified  some  of  the 
background  processes:  the  module  that  resolves  data  incon¬ 
sistencies,  the  one  that  recovers  storage  space,  and  parts  of 
the  resource  allocator  itself.  The  system  should  never  be 
idle.  We  allocate  computational  resources  to  modules  via  a 
separate  process,  a  metalevel  process,  that  changes  the  time 
slice  allocated  to  various  processes.  A  process  that  produces 
data,  including  opinions,  that  are  used  by  other  processes 
gets  more  resources  than  a  producer  of  unused  data.  In  ad¬ 
dition,  a  process  can  request  more  resources  if  it  determines 
such  a  need,  so  that  critical  processes  can  ask  for  priority. 
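
The allocation policy is described only qualitatively, so the sketch below is one hypothetical way to realize it, not the policy of the implemented system. It assumes the metalevel process knows how often each producer's data were read by other processes and what extra weight each process has requested. The weighting formula, the function name, and the process names are invented for illustration.

```python
from typing import Dict


def allocate_slices(
    reads_by_others: Dict[str, int],
    requested_boost: Dict[str, float],
    total_ms: float,
) -> Dict[str, float]:
    """Split a scheduling period among processes (hypothetical policy).

    Every process gets a base weight of 1 so that none is starved;
    producers whose data are read by others, and processes that ask
    for priority, get proportionally more.
    """
    weights = {
        name: 1.0 + reads + requested_boost.get(name, 0.0)
        for name, reads in reads_by_others.items()
    }
    total_weight = sum(weights.values())
    return {name: total_ms * w / total_weight for name, w in weights.items()}


slices = allocate_slices(
    reads_by_others={"segmenter": 9, "shadow_finder": 0, "navigator": 4},
    requested_boost={"navigator": 5.0},   # a critical process asks for priority
    total_ms=210.0,
)
```

In this example the producer of unused data still runs, but with a tenth of the share given to a producer whose output is heavily consumed.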

Our  current  system  implementation  is  one  in  which  all 
the  various  processes  execute  on  one  computer  system  and 
all  interact  through  a  common  virtual  address  space.  This 
approach  was  adopted  to  eliminate  the  system  building  nec¬ 
essary  to  run  experiments  on  multiple  processors.  However, 
the  design  of  the  system  assumes  a  virtual  environment  in 
which  there  are  many  processors  running  in  parallel,  with 
a  communications  network  between  them.  This  accounts 
for  the  design  decision  of  the  rather  loose  coupling  between 
processes. On a network of parallel processors, we would expect
some processors to be dedicated to particular processes
whose  computational  task  is  matched  to  the  particular  ma¬ 
chine  hardware.  Other  processes  would  be  allocated  among 
the available processors. Although we are aware of the bottleneck
that might be caused by centralizing the database, we
envisage  a  system  in  which  the  process  accepting  requests  for 
database  transactions  will  be  centralized  but  the  database 
itself  and  the  procedures  that  carry  out  the  internal  process¬ 
ing  may  be  split  across  processors. 


8  Tasks 

The  information  system  that  we  have  described  presupposes 
that  the  job  of  interpreting  sensor  data  can  be  subdivided 
into  pieces,  and  that  it  is  the  combination  of  the  process¬ 
ing  provided  by  these  pieces  (using  stored  knowledge)  that 
achieves  sensor  interpretation.  The  goal  of  sensor  interpre¬ 
tation  is  clear:  we  need  to  build  a  model  of  the  world  that 
is  being  sensed,  but  what  the  individual  tasks  are,  and  how 
stored  knowledge  is  used  is  not  obvious. 

In  the  vision  literature  we  see  a  wide  range  of  experiments
that  have  probed  for  an  answer  to  the  first  of  these,  identifying
the  tasks.  However,  this  question  has  usually  been  posed
in  a  context  in  which  little  data  were  available,  save  the  sensory
signal.  In  a  system  like  the  one  we  describe,  knowledge
is  vitally  important;  we  might  therefore  expect  the  division
to  be  into  tasks  that  are  somewhat  different  from  those  used
when  the  signal  is  the  only  data.  To  be  more  specific,  let  us
look  at  an  example  of  how  the  availability  of  knowledge  may
influence  sensory  processing  on  an  autonomous  vehicle.

Given  that  our  database  includes  a  general  description  of 
the  terrain  and  the  major  objects  in  the  area,  we  can  use 
the  generic  iconic  models  of  the  objects  to  construct  the 
image  that  the  sensor  expects  to  see.  One  task  that  must 
be  performed  is  to  confirm  that  the  expected  objects  are 
present.  This  is  model-based  vision,  but  it  differs  from  the 
conventional  model-based  approach  in  that  the  models  we 
have  are  only  generic. 

The  procedure  for  verifying  models  must  be  able  to  match
properties  and  parameters  of  this  model  to  the  data,  rather
than  use  template-based  matching  of  intensity  images.  Of
course  we  will  not  be  able  to  generate  all  the  objects  in
the  scene;  some,  such  as  unexpected  objects,  will  not  be  in
the  initial  database.  Here  a  bottom-up  approach  is  needed.
However,  as  we  verify  the  existence  of  some  objects,  tasks
like  image  segmentation  will  be  focussed  on  those  areas  that
still  must  be  explained.
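
A minimal sketch of this kind of verification follows. It assumes, purely for illustration, that a generic model is a label plus an allowed range for each parameter and that each image region carries a set of measurements; the field names, the tolerance, and the greedy matching order are hypothetical and are not taken from the system described here.

```python
def matches(model, region, tolerance=0.25):
    """True if every parameter range of the generic model admits the region."""
    for name, (low, high) in model["parameters"].items():
        value = region["measurements"].get(name)
        if value is None:
            return False
        # Generic models are loose, so widen each range by a fraction of its size.
        slack = tolerance * (high - low)
        if not (low - slack <= value <= high + slack):
            return False
    return True


def verify_expected(expected_models, regions):
    """Confirm expected objects; return what is left for bottom-up processing."""
    confirmed = {}
    unexplained = list(regions)
    for model in expected_models:
        for region in unexplained:
            if matches(model, region):
                confirmed[model["label"]] = region["id"]
                unexplained.remove(region)
                break
    return confirmed, [region["id"] for region in unexplained]


expected = [{"label": "building",
             "parameters": {"height": (3.0, 10.0), "width": (5.0, 30.0)}}]
regions = [{"id": "r1", "measurements": {"height": 6.0, "width": 12.0}},
           {"id": "r2", "measurements": {"height": 1.0, "width": 2.0}}]
confirmed, unexplained = verify_expected(expected, regions)
```

The second return value lists the regions that no expected object accounts for; these are the areas on which segmentation and other bottom-up tasks would then concentrate.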

Even  the  approach  taken  to  a  low-level  procedure  like  segmentation
is  altered  by  the  availability  of  terrain  data.  That
data  can  be  used  to  break  the  interdependence  of  surface
slope  and  surface  albedo  in  determining  image  intensity,  and
hence  can  allow  segmentation  of  the  intensity  image  to  be
replaced  by  segmentation  of  the  albedo  image.  In  addition,
textures  seen  in  previous  images  can  directly  influence  the
manner  in  which  segmentation  proceeds  in  the  new  image.
Stored  knowledge,  in  this  case  terrain  data,  can  even  be
used  to  correct  the  textures  for  perspective  distortion.  The
results  obtained  by  algorithms  for  low-level  processes,  like
segmentation,  are  now  as  much  a  function  of  the  database
knowledge  as  of  the  signal.
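
To make the slope and albedo point concrete, the sketch below assumes a Lambertian surface and a known light direction, so that intensity is the product of albedo and the cosine of the angle between the surface normal and the light. Neither the reflectance model nor the function names come from the paper; they are assumptions chosen to show how a terrain gradient lets albedo be recovered from intensity.

```python
import math


def unit(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def normal_from_slope(dz_dx, dz_dy):
    """Upward unit normal of the terrain surface z(x, y) from its gradient."""
    return unit((-dz_dx, -dz_dy, 1.0))


def albedo(intensity, dz_dx, dz_dy, light_dir, min_shading=0.05):
    """Albedo under the assumed model: intensity = albedo * (normal . light)."""
    normal = normal_from_slope(dz_dx, dz_dy)
    shading = sum(n * s for n, s in zip(normal, unit(light_dir)))
    if shading < min_shading:
        return None  # surface faces away from the light, or nearly so
    return intensity / shading


overhead = (0.0, 0.0, 1.0)
flat_patch = albedo(0.40, 0.0, 0.0, overhead)
sloped_patch = albedo(0.40 / math.sqrt(2.0), 1.0, 0.0, overhead)
```

The two patches differ in image intensity but yield the same albedo once the terrain slope is accounted for, so a segmentation of the albedo image would group them even though a segmentation of the intensity image might not.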

The  correct  decomposition  of  the  job  of  interpreting  sensory
data  into  tasks  is  an  open  research  issue,  but  one  that
will  be  greatly  influenced  by  the  availability  of  stored  knowledge.
In  a  similar  fashion,  higher-level  reasoning  tasks  are
influenced  by  the  increasing  competence  of  low-level  processes
to  provide  information  in  summary  form.  The  integration
of  higher-level  reasoning  and  low-level  signal  processing
should  not  be  achieved  by  resorting  to  mechanisms
in  which  higher-level  processes  reason  about  image  features;
high-level  processes  should  use  the  qualitative  descriptors
that  intermediate-level  vision  can  assemble  from  low-level
signal  processing  and  stored  knowledge.

Figure  5:  Database  Display.  Generic  models  are  used  to
display  the  current  contents  of  the  database.  Sketch  maps
designed  with  the  same  tool  are  used  to  put  information  into
the  database.


9  Experimental  Environment 

Any  project  that  involves  substantial  software  development
makes  use  of  available  software  tools  and  builds  others  where
they  do  not  already  exist  (or  are  unattainable).  We  have
mentioned  using  a  graph  manipulation  package  to  build  the
semantic  network,  and  various  processes  have  made  use  of
existing  image-processing,  graphics,  and  three-dimensional
modeling  packages.  Our  system  is  built  in  Lisp  and  makes
extensive  use  of  the  flexibility  a  Lisp  environment  can  provide.
The  system  described  runs  on  the  Symbolics  3600
family  of  Lisp  machines.  All  system-building  tools  can  be
executed  as  independent  processes  that  run  simultaneously
with  the  processes  that  manipulate  data  and  carry  out  reasoning
activities.  They  provide  an  interactive  environment
in  which  to  experiment.  An  example  of  the  flexibility  such
an  environment  can  provide  is  shown  in  Figure  5,  which  displays
a  portion  of  the  database.  Because  we  must  be  able
to  determine  the  current  state  of  the  database  as  processing
proceeds  and  because  most  of  the  data  contained  in  it
describe  spatial  information,  we  choose  to  display  it  pictorially.
As  some  data  tokens  may  only  be  labeled  with  general
terms,  such  as  immovable. object,  we  display  generic  iconic
models  of  the  data  tokens.  Entering  spatial  data  into  the
database,  particularly  more  qualitative  data  like  a  sketch
map,  is  made  easy  with  such  a  tool.  We  create  a  display  using
generic  models  and  place  the  sketch-map  data  into  the
database.

Tools  that  lessen  the  effort  needed  to  build  systems  make
it  possible  to  embark  on  the  experiments  that  require  a  complete
system  to  be  in  place  before  the  simplest  trial  can  commence.
Only  through  experimentation  with  real  data  on  a
running  system  can  competence  be  fairly  evaluated.


10  Summary 

The  natural,  outdoor  environment  in  which  an  autonomous
land  vehicle  operates  imposes  substantial  obstacles  to  the  integration
of  the  vehicle's  various  sensory,  planning,  navigational,
and  control  activities.  The  complexity  of  the  domain
and  the  requirement  for  high  reliability  rule  out  approaches
that  do  not  make  substantial  use  of  stored  knowledge  about
the  environment.  An  intelligent  database  that  competently
contributes  to  the  processes  that  perform  these  various  activities
is  central  to  the  overall  design  of  an  autonomous
system.

Today’s  technology  is  not  capable  of  directly  integrating
sensory  information  with  stored  knowledge  in  one  step.  To
cope  with  the  irregularities  and  imperfections  of  the  outdoor
world,  a  series  of  interactions  is  needed  to  reach  tentative
conclusions  that  constrain  the  final  outcome.  For  this  reason,
our  software  architecture  for  sensory  integration  is  a
community  of  interacting  processes,  each  of  which  has  its
own  limited  goals  and  expertise,  but  all  of  which  cooperate
to  achieve  the  higher  goals  of  the  system.

Processes  must  be  able  to  take  advantage  of  relevant
knowledge  that  may  be  available.  The  design  of  a  knowledge
system  must  include  a  means  for  effectively  communicating
semantic  information  to  the  multiple  and  varied
processes  that  wish  to  consider  it.  A  vocabulary  of  terms
and  a  set  of  connections  among  them  serve  this  purpose  in
our  system.  The  vocabulary  consists  of  a  domain-specific
set  of  terms  that  have  been  identified  as  being  both  useful
for,  and  instantiable  by,  the  computational  processes.  A  semantic
network  is  used  to  encode  the  specialization  lattice
of  the  concepts  and  the  physical  decomposition  of  composite
objects.
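
The two kinds of connection named above, specialization and physical decomposition, can be sketched as two edge sets over one vocabulary. The class below is an illustrative assumption rather than the system's graph package, and the terms used (object, structure, landmark, building, wall, roof, window) are hypothetical vocabulary entries chosen for the example.

```python
class SemanticNetwork:
    """Hypothetical sketch: a vocabulary with is-a and part-of connections."""

    def __init__(self):
        self.parents = {}  # term -> more general terms (a lattice, so several)
        self.parts = {}    # composite term -> its immediate component terms

    def add_term(self, term, generalizations=()):
        self.parents.setdefault(term, set()).update(generalizations)
        for general in generalizations:
            self.parents.setdefault(general, set())

    def add_part(self, whole, part):
        self.parts.setdefault(whole, set()).add(part)

    def is_a(self, term, general):
        """True if `general` is `term` itself or lies above it in the lattice."""
        stack, seen = [term], set()
        while stack:
            current = stack.pop()
            if current == general:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return False

    def all_parts(self, whole):
        """Every component reachable by repeated physical decomposition."""
        found, stack = set(), [whole]
        while stack:
            for part in self.parts.get(stack.pop(), ()):
                if part not in found:
                    found.add(part)
                    stack.append(part)
        return found


net = SemanticNetwork()
net.add_term("object")
net.add_term("structure", ["object"])
net.add_term("landmark", ["object"])
net.add_term("building", ["structure", "landmark"])
net.add_part("building", "wall")
net.add_part("building", "roof")
net.add_part("wall", "window")
```

Because a term may have several generalizations, a process that only understands "landmark" and one that only understands "structure" can both make use of a token labeled "building".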

The  database  architecture  for  an  autonomous  system
must  allow  multiple  representations  of  world  data.  It  must
support  quantitative  and  qualitative,  inconsistent  and  approximate
data  at  multiple  levels  of  resolution.  Our  architecture
is  based  on  spatial  and  semantic  directories  that  organize
the  various  representations  of  the  knowledge  to  allow
for  focussed  processing,  for  flexibility  of  access,  for  modularity
of  task  processing,  and  for  asynchronous  process  control.
The  directories  both  link  information  stored  in  the  different
representations  and  encode  the  relationships  among  objects.
They  permit  the  achievement  of  partial  data  consistency,  as
required  for  the  task  at  hand,  rather  than  complete  consistency
of  all  data,  relevant  or  not.  In  a  departure  from  the
traditional  strategy  of  resolving  all  conflicts  at  the  time  of
insertion,  our  knowledge  system  defers  this  chore  until  the
data  are  required,  when  the  relation  of  the  information
to  a  task  is  known,  and  when  more  data  are  likely  to  be
available.
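
Deferring conflict resolution until the data are required can be sketched as a store that keeps every assertion at insertion time and applies a task-supplied policy only at retrieval. The class, the record fields, and the two policies below are assumptions made for illustration; the paper does not specify how a task resolves competing opinions.

```python
class DeferredStore:
    """Hypothetical sketch: conflicts are kept at insertion, resolved on use."""

    def __init__(self):
        self.assertions = []  # every assertion is retained, conflicting or not

    def insert(self, token, attribute, value, source, confidence):
        self.assertions.append({"token": token, "attribute": attribute,
                                "value": value, "source": source,
                                "confidence": confidence})

    def opinions(self, token, attribute):
        return [a for a in self.assertions
                if a["token"] == token and a["attribute"] == attribute]

    def resolve(self, token, attribute, policy):
        """Apply the requesting task's policy to whatever is known right now."""
        candidates = self.opinions(token, attribute)
        return policy(candidates) if candidates else None


def most_confident(candidates):
    # Assumed policy for a task that wants the single most credible value.
    return max(candidates, key=lambda a: a["confidence"])["value"]


def largest_value(candidates):
    # Assumed policy for a task that must be conservative, e.g. about clearance.
    return max(a["value"] for a in candidates)


store = DeferredStore()
store.insert("tree-17", "height", 12.0, "stereo", 0.6)
store.insert("tree-17", "height", 15.0, "map", 0.9)
store.insert("tree-17", "height", 18.0, "laser", 0.4)
for_description = store.resolve("tree-17", "height", most_confident)
for_clearance = store.resolve("tree-17", "height", largest_value)
```

The same three inconsistent heights produce different answers for the two tasks, and any assertion that arrives before the next request is automatically taken into account.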

The  autonomous  system  that  we  describe  consists  of  a
community  of  interacting  processes  that  attempt  to  cooperate
in  achieving  the  goals  of  the  system.  The  database  is
an  active  participant  in  the  system,  not  merely  a  data  store,
merging  the  functional  aspects  of  process  control  and  data
organization.